message size
Probabilistic Latency Analysis of the Data Distribution Service in ROS 2
Lee, Sanghoon, Park, Hyung-Seok, Chae, Jiyeong, Park, Kyung-Joon
Robot Operating System 2 (ROS 2) is now the de-facto standard for robotic communication, pairing UDP transport with the Data Distribution Service (DDS) publish-subscribe middleware. DDS achieves reliability through periodic heartbeats that solicit acknowledgments for missing samples and trigger selective retransmissions. In lossy wireless networks, the tight coupling among heartbeat period, IP fragmentation, and retransmission interval obscures end-to-end latency behavior and leaves practitioners with little guidance on how to tune these parameters. To address these challenges, we propose a probabilistic latency analysis (PLA) that analytically models the reliable transmission process of ROS 2 DDS communication using a discrete-state approach. By systematically analyzing both middleware-level and transport-level events, PLA computes the steady-state probability distribution of unacknowledged messages and the retransmission latency. Our findings establish a theoretical basis to systematically optimize reliability, latency, and performance in wireless industrial robotics. Communication has become an increasingly critical factor in modern robotics. Conventional fixed-station robots have long relied on wired links, such as Ethernet-based fieldbuses, for control and data exchange, benefiting from their stability and low latency. For mobile robots and multi-robot systems, however, the need to cut the tether and adopt wireless communication is rapidly growing [1].
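To make the coupling between fragmentation, loss, and heartbeat period concrete, the following is a minimal sketch of a much simpler model than the PLA described above. It assumes (these are illustrative assumptions, not taken from the paper) that fragments are lost independently, that a sample is delivered only if all of its fragments arrive, and that each failed attempt costs exactly one heartbeat period before the retransmission. The function names and the 1472-byte fragment payload are hypothetical.

```python
import math


def fragments(message_bytes: int, fragment_payload: int = 1472) -> int:
    """Number of IP fragments for one sample (assumed fixed payload per fragment)."""
    return max(1, math.ceil(message_bytes / fragment_payload))


def delivery_probability(loss: float, n_fragments: int) -> float:
    """Probability that every fragment of one attempt arrives (independent loss)."""
    return (1.0 - loss) ** n_fragments


def expected_retransmission_latency(
    message_bytes: int,
    loss: float,
    heartbeat_s: float,
    fragment_payload: int = 1472,
) -> float:
    """Expected extra latency caused by retransmissions, in seconds.

    The number of failed attempts before the first success is geometric
    with mean (1 - q) / q; each failure is assumed to cost one heartbeat period.
    """
    q = delivery_probability(loss, fragments(message_bytes, fragment_payload))
    if q == 0.0:
        return math.inf
    return heartbeat_s * (1.0 - q) / q


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        extra = expected_retransmission_latency(size, loss=0.05, heartbeat_s=0.1)
        print(f"{size:>7} bytes -> {fragments(size):>3} fragments, "
              f"expected extra latency {extra:.3f} s")
```

Even under these simplifying assumptions, the sketch shows why larger messages suffer disproportionately on lossy links: the per-attempt success probability decays exponentially with the fragment count.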
The Big Send-off: High Performance Collectives on GPU-based Supercomputers
Singh, Siddharth, Singh, Mahua, Bhatele, Abhinav
We evaluate the current state of collective communication on GPU-based supercomputers for large language model (LLM) training at scale. Existing libraries such as RCCL and Cray-MPICH exhibit critical limitations on systems such as Frontier -- Cray-MPICH underutilizes network and compute resources, while RCCL suffers from severe scalability issues. To address these challenges, we introduce PCCL, a communication library with highly optimized implementations of all-gather and reduce-scatter operations tailored for distributed deep learning workloads. PCCL is designed to maximally utilize all available network and compute resources and to scale efficiently to thousands of GPUs. It achieves substantial performance improvements, delivering 6-33x speedups over RCCL and 28-70x over Cray-MPICH for all-gather on 2048 GCDs of Frontier. These gains translate directly to end-to-end performance: in large-scale GPT-3-style training, PCCL provides up to 60% and 40% speedups over RCCL for 7B and 13B parameter models, respectively.
- Energy (0.94)
- Government > Regional Government (0.47)
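For readers unfamiliar with the all-gather collective mentioned in the abstract above, here is a minimal sketch that simulates the textbook ring all-gather in a single process. It is a generic illustration of what the operation computes, not PCCL's implementation, and the function name `ring_all_gather` is hypothetical.

```python
def ring_all_gather(chunks):
    """Simulate a ring all-gather over len(chunks) ranks.

    Rank r starts with only chunks[r]. In each of the p - 1 steps, every rank
    forwards to its right neighbour the chunk it obtained most recently.
    Returns one buffer per rank; every buffer ends up holding all chunks.
    """
    p = len(chunks)
    buffers = [[None] * p for _ in range(p)]
    for rank in range(p):
        buffers[rank][rank] = chunks[rank]

    for step in range(p - 1):
        # Collect all sends first so the step behaves as if ranks ran in parallel.
        sends = []
        for rank in range(p):
            index = (rank - step) % p  # chunk received in the previous step
            sends.append(((rank + 1) % p, index, buffers[rank][index]))
        for destination, index, data in sends:
            buffers[destination][index] = data
    return buffers


if __name__ == "__main__":
    result = ring_all_gather(["a", "b", "c", "d"])
    for rank, buffer in enumerate(result):
        print(rank, buffer)
```

Each rank sends and receives one chunk per step, so the ring uses p - 1 steps in total; reduce-scatter follows the same communication pattern with a reduction applied to each received chunk.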
Demystifying the Communication Characteristics for Distributed Transformer Models
Anthony, Quentin, Michalowicz, Benjamin, Hatef, Jacob, Xu, Lang, Abduljabbar, Mustafa, Shafi, Aamir, Subramoni, Hari, Panda, Dhabaleswar
Deep learning (DL) models based on the transformer architecture have revolutionized many DL applications such as large language models (LLMs), vision transformers, audio generation, and time series prediction. Much of this progress has been fueled by distributed training, yet distributed communication remains a substantial bottleneck to training progress. This paper examines the communication behavior of transformer models, that is, how the different parallelism schemes used in multi-node/multi-GPU DL training communicate data in the context of transformers. We use GPT-based language models as a case study of the transformer architecture due to their ubiquity. We validate the empirical results obtained from our communication logs using analytical models. At a high level, our analysis reveals a need to further optimize small-message point-to-point communication, identifies correlations between sequence length, per-GPU throughput, model size, and the optimizations used, and indicates where further optimizations in framework and HPC middleware design could be directed.
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- North America > United States > Texas (0.04)
A Containerized Microservice Architecture for a ROS 2 Autonomous Driving Software: An End-to-End Latency Evaluation
Betz, Tobias, Wen, Long, Pan, Fengjunjie, Kaljavesi, Gemb, Zuepke, Alexander, Bastoni, Andrea, Caccamo, Marco, Knoll, Alois, Betz, Johannes
The automotive industry is transitioning from traditional ECU-based systems to software-defined vehicles. Containers, lightweight virtualization technologies that enable the flexible consolidation of complex software applications on a common hardware platform, play a central role in this transition. Despite their widespread adoption, the impact of containerization on fundamental real-time metrics such as end-to-end latency, communication jitter, and memory and CPU utilization has remained virtually unexplored. This paper presents a microservice architecture for a real-world autonomous driving application in which containers isolate each service. Our comprehensive evaluation shows that this solution improves end-to-end latency even over standard bare-Linux deployments. Specifically, for the presented microservice architecture, the mean end-to-end latency improves by 5-8%, and the maximum latencies are significantly reduced under container deployment.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.72)
- Information Technology > Robotics & Automation (0.72)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Pub/Sub Message Brokers for GenAI
Saleh, Alaa, Pirttikangas, Susanna, Lovén, Lauri
In today's digital world, Generative Artificial Intelligence (GenAI) such as Large Language Models (LLMs) is becoming increasingly prevalent, extending its reach across diverse applications. This surge in adoption has sparked a significant increase in demand for data-centric GenAI models, highlighting the necessity for robust data communication infrastructures. Central to this need are message brokers, which serve as essential channels for data transfer within various system components. This survey aims to delve into a comprehensive analysis of traditional and modern message brokers, offering a comparative study of prevalent platforms. Our study considers numerous criteria including, but not limited to, open-source availability, integrated monitoring tools, message prioritization mechanisms, capabilities for parallel processing, reliability, distribution and clustering functionalities, authentication processes, data persistence strategies, fault tolerance, and scalability. Furthermore, we explore the intrinsic constraints that the design and operation of each message broker might impose, recognizing that these limitations are crucial in understanding their real-world applicability. We then leverage these insights to propose a sophisticated message broker framework -- one designed with the adaptability and robustness necessary to meet the evolving requisites of GenAI applications. Finally, this study examines the enhancement of message broker mechanisms specifically for GenAI contexts, emphasizing the criticality of developing a versatile message broker framework. Such a framework would be poised for quick adaptation, catering to the dynamic and growing demands of GenAI in the foreseeable future. Through this dual-pronged approach, we intend to contribute a foundational compendium that can guide future innovations and infrastructural advancements in the realm of GenAI data communication.
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- North America > United States > Texas (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Overview (1.00)
- Research Report (0.70)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Architecture (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)
Learning to Cooperate and Communicate Over Imperfect Channels
Weil, Jannis, Ekinci, Gizem, Koeppl, Heinz, Meuser, Tobias
Information exchange in multi-agent systems improves the cooperation among agents, especially in partially observable settings. In the real world, communication is often carried out over imperfect channels. This requires agents to handle uncertainty due to potential information loss. In this paper, we consider a cooperative multi-agent system where the agents act and exchange information in a decentralized manner using a limited and unreliable channel. To cope with such channel constraints, we propose a novel communication approach based on independent Q-learning. Our method allows agents to dynamically adapt how much information to share by sending messages of different sizes, depending on their local observations and the channel's properties. In addition to this message size selection, agents learn to encode and decode messages to improve their jointly trained policies. We show that our approach outperforms approaches without adaptive capabilities in a novel cooperative digit-prediction environment and discuss its limitations in the traffic junction environment.
- Research Report (0.64)
- Workflow (0.46)
Federated learning compression designed for lightweight communications
Ribeiro, Lucas Grativol, Leonardon, Mathieu, Muller, Guillaume, Fresse, Virginie, Arzel, Matthieu
Federated Learning (FL) is a promising distributed method for edge-level machine learning, particularly for privacy-sensitive applications such as those in military and medical domains, where client data cannot be shared or transferred to a cloud computing server. In many use cases, communication cost is a major challenge in FL because of its inherently intensive network usage. Client devices, such as smartphones or Internet of Things (IoT) nodes, have limited resources in terms of energy, computation, and memory. To address these hardware constraints, lightweight models and compression techniques such as pruning and quantization are commonly adopted in centralised paradigms. In this paper, we investigate the impact of compression techniques on FL for a typical image classification task. Going further, we demonstrate that a straightforward method can compress messages by up to 50% with less than 1% accuracy loss, competing with state-of-the-art techniques.
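As an illustration of the two compression families named in the abstract above, the following minimal sketch applies magnitude pruning and uniform quantization to a flat list of weights. It is a generic example, not the method evaluated in the paper; the function names and the 8-bit default are assumptions made for illustration.

```python
def prune(values, keep_fraction):
    """Magnitude pruning: zero out all but the largest-magnitude weights.

    Ties at the threshold are kept, so slightly more than keep_fraction
    of the weights may survive.
    """
    keep = max(1, int(len(values) * keep_fraction))
    threshold = sorted(abs(v) for v in values)[-keep]
    return [v if abs(v) >= threshold else 0.0 for v in values]


def quantize(values, bits=8):
    """Uniform quantization: map each float to an integer code in [0, 2**bits - 1]."""
    low, high = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (high - low) / levels if high > low else 1.0
    codes = [round((v - low) / scale) for v in values]
    return codes, low, scale


def dequantize(codes, low, scale):
    """Reconstruct approximate floats from integer codes."""
    return [low + code * scale for code in codes]


if __name__ == "__main__":
    weights = [0.1, -0.9, 0.5, 0.05, 0.73, -0.31]
    sparse = prune(weights, keep_fraction=0.5)
    codes, low, scale = quantize(sparse, bits=8)
    restored = dequantize(codes, low, scale)
    print("pruned:  ", sparse)
    print("codes:   ", codes)
    print("restored:", [round(v, 3) for v in restored])
```

Sending 8-bit codes plus two floats (`low`, `scale`) in place of 32-bit weights shrinks the payload roughly fourfold in this sketch, at the cost of a reconstruction error bounded by half a quantization step.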
Scalability of Message Encoding Techniques for Continuous Communication Learned with Multi-Agent Reinforcement Learning
Vanneste, Astrid, Somers, Thomas, Vanneste, Simon, Mets, Kevin, De Schepper, Tom, Mercelis, Siegfried, Hellinckx, Peter
Many multi-agent systems require inter-agent communication to properly achieve their goal. By learning the communication protocol alongside the action protocol using multi-agent reinforcement learning techniques, the agents gain the flexibility to determine which information should be shared. However, when the number of agents increases we need to create an encoding of the information contained in these messages. In this paper, we investigate the effect of increasing the amount of information that should be contained in a message and increasing the number of agents. We evaluate these effects on two different message encoding methods, the mean message encoder and the attention message encoder. We perform our experiments on a matrix environment. Surprisingly, our results show that the mean message encoder consistently outperforms the attention message encoder. Therefore, we analyse the communication protocol used by the agents that use the mean message encoder and can conclude that the agents use a combination of an exponential and a logarithmic function in their communication policy to avoid the loss of important information after applying the mean message encoder.
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
- Asia > Middle East > Jordan (0.04)
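The two encoders compared in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the attention variant here uses the raw messages as both keys and values and takes a given query vector, whereas a learned encoder would normally include trained projections. All function names are hypothetical.

```python
import math


def mean_encoder(messages):
    """Average incoming message vectors into one fixed-size vector."""
    count = len(messages)
    dim = len(messages[0])
    return [sum(m[i] for m in messages) / count for i in range(dim)]


def attention_encoder(messages, query):
    """Softmax-weighted sum of messages using scaled dot-product scores.

    Assumption: messages act as both keys and values (no learned projections).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, m)) / scale for m in messages]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(messages[0])
    return [sum(w * m[i] for w, m in zip(weights, messages)) for i in range(dim)]


if __name__ == "__main__":
    incoming = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
    print("mean:     ", mean_encoder(incoming))
    print("attention:", attention_encoder(incoming, query=[1.0, 0.0]))
```

Both encoders produce an output whose size is independent of the number of agents, which is what makes them candidates for scaling; with an all-zero query the attention weights become uniform and the two encoders coincide.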